discrete variational autoencoder
DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
Boltzmann machines are powerful distributions that have been shown to be an effective prior over binary latent variables in variational autoencoders (VAEs). However, previous methods for training discrete VAEs have used the evidence lower bound and not the tighter importance-weighted bound. We propose two approaches for relaxing Boltzmann machines to continuous distributions that permit training with importance-weighted bounds. These relaxations are based on generalized overlapping transformations and the Gaussian integral trick. Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. An implementation which reproduces these results is available.
Reviews: DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
A key aspect of these works is retaining the ability to train the models with low-variance gradient estimates of the variational objective, obtained via the reparameterisation trick, by relaxing the discrete latent variables with associated continuous-valued variables. Of particular significance to this submission are the discrete VAE (dVAE) (Rolfe, 2016) and dVAE++ (Vahdat et al., 2018) models, which use a Boltzmann machine (BM) prior on the discrete latent variables and construct a differentiable proxy variational objective by introducing continuous variables \zeta corresponding to relaxations of the discrete variables z, with \zeta depending on z via a *smoothing* conditional distribution r(\zeta | z). The generative process in the decoder model is specified such that generated outputs x are conditionally independent of the discrete variables z given the continuous variables \zeta. An issue identified with the (differentiable proxy) variational objective used in both the dVAE and dVAE++ approaches is that it is not amenable to being formulated as an importance-weighted bound, even though importance-weighted objectives for continuous VAE models have been found to give significant improvements in training performance (Burda et al., 2015). In this submission the authors suggest an alternative dVAE formulation, which they term dVAE#, that is able to use an importance-weighted objective.
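To make the distinction between the two objectives mentioned in the review concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes K log importance weights log w_k = log p(x, \zeta_k) - log q(\zeta_k | x) are already available; here they are filled with hypothetical random values, and the function names `elbo` and `iw_bound` are illustrative only.

```python
import numpy as np


def elbo(log_w):
    """Average of the log-weights: a Monte Carlo estimate of the evidence lower bound."""
    return float(np.mean(log_w))


def iw_bound(log_w):
    """K-sample importance-weighted bound, log((1/K) * sum_k w_k).

    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = np.max(log_w)
    return float(m + np.log(np.mean(np.exp(log_w - m))))


# Hypothetical log-weights standing in for log p(x, zeta_k) - log q(zeta_k | x).
rng = np.random.default_rng(0)
log_w = rng.normal(loc=-90.0, scale=2.0, size=64)

print("ELBO estimate:    ", elbo(log_w))
print("IW bound estimate:", iw_bound(log_w))
```

By Jensen's inequality the log of the mean weight is never smaller than the mean of the log weights, so on any fixed set of samples the second estimate is at least as large as the first. This is the sense in which the importance-weighted bound is called tighter.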
CaloDVAE : Discrete Variational Autoencoders for Fast Calorimeter Shower Simulation
Abhishek, Abhishek, Drechsler, Eric, Fedorko, Wojciech, Stelzer, Bernd
Calorimeter simulation is the most computationally expensive part of Monte Carlo generation of samples necessary for analysis of experimental data at the Large Hadron Collider (LHC). The High-Luminosity upgrade of the LHC would require an even larger amount of such samples. We present a technique based on Discrete Variational Autoencoders (DVAEs) to simulate particle showers in Electromagnetic Calorimeters. We discuss how this work paves the way towards exploration of quantum annealing processors as sampling devices for generation of simulated High Energy Physics datasets.
DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
Vahdat, Arash, Andriyash, Evgeny, Macready, William
Boltzmann machines are powerful distributions that have been shown to be an effective prior over binary latent variables in variational autoencoders (VAEs). However, previous methods for training discrete VAEs have used the evidence lower bound and not the tighter importance-weighted bound. We propose two approaches for relaxing Boltzmann machines to continuous distributions that permit training with importance-weighted bounds. These relaxations are based on generalized overlapping transformations and the Gaussian integral trick. Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors.
DVAE++: Discrete Variational Autoencoders with Overlapping Transformations
Vahdat, Arash, Macready, William G., Bian, Zhengbing, Khoshaman, Amir
Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult. We propose a new class of smoothing transformations based on a mixture of two overlapping distributions, and show that the proposed transformation can be used for training binary latent models with either directed or undirected priors. We derive a new variational bound to efficiently train with Boltzmann machine priors. Using this bound, we develop DVAE++, a generative model with a global discrete prior and a hierarchy of convolutional continuous variables. Experiments on several benchmarks show that overlapping transformations outperform other recent continuous relaxations of discrete latent variables including Gumbel-Softmax (Maddison et al., 2016; Jang et al., 2016), and discrete variational autoencoders (Rolfe, 2016).
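As an illustration of what a pair of overlapping smoothing distributions can look like, the sketch below assumes two truncated exponentials on [0, 1]: r(\zeta | z=0) proportional to exp(-\beta \zeta) and r(\zeta | z=1) proportional to exp(\beta (\zeta - 1)). This choice, the inverse temperature `beta`, and the function name `sample_smoothed` are assumptions made for the example; the abstract itself does not specify them. The sketch only draws \zeta given a fixed binary z by inverting the CDF, and does not show the reparameterised gradient path through the mixture with z marginalised out.

```python
import numpy as np


def sample_smoothed(z, beta, rng):
    """Draw zeta in [0, 1] from an assumed overlapping smoothing distribution r(zeta | z).

    z    : array of 0/1 values
    beta : positive inverse temperature; larger values concentrate zeta near z
    """
    u = rng.uniform(size=z.shape)
    # Inverse CDF of the density proportional to exp(-beta * zeta) on [0, 1].
    zeta0 = -np.log1p(-u * (1.0 - np.exp(-beta))) / beta
    # r(zeta | z=1) is the mirror image of r(zeta | z=0) about zeta = 0.5.
    return np.where(z == 1, 1.0 - zeta0, zeta0)


rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=10_000)
zeta = sample_smoothed(z, beta=5.0, rng=rng)

print("mean zeta given z=0:", zeta[z == 0].mean())
print("mean zeta given z=1:", zeta[z == 1].mean())
```

Both conditionals have support on the whole interval, which is what "overlapping" refers to: any value of \zeta has non-zero density under either state of z, with `beta` controlling how sharply the mass concentrates near 0 or 1.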